Abstract. We determine the asymptotic proportion of minimal automata, within n-state
accessible deterministic complete automata over a k-letter alphabet, with the uniform
distribution over the possible transition structures, and a binomial distribution over
terminal states, with arbitrary parameter b. It turns out that a fraction
$\sim 1 - C(k,b)\, n^{-k+2}$ of automata is minimal, with C(k, b) a function, explicitly
determined, involving the solution of a transcendental equation.
1. Introduction

To any regular language, one can associate in a unique way its minimal automaton,
i.e. the only accessible complete deterministic automaton recognizing the language, with
minimal number of states. Therefore the space complexity of a regular language can be seen
as the number of states of its minimal automaton. The worst-case complexity of algorithms
dealing with finite automata is known in most cases [29]. But the average-case analysis of
algorithms requires weighted sums on the set of possible realizations, and in particular the
enumeration of the objects that are handled [10]. Therefore a precise enumeration is often
required for the algorithmic study of regular languages.

The enumeration of finite automata according to various criteria (with or without initial 
state [19], non-isomorphic [14], up to permutation of the labels of the edges [13], with a
strongly connected underlying graph [22], [191 123 EQ] , acyclic [23],. . . ) has been investigated 
since the fifties. 

In [19] Korshunov determines the asymptotic estimate of the number of accessible com- 
plete and deterministic n-state automata over a finite alphabet. His derivation, and even 
the formulation of the result, are quite complicated. In [3] a reformulation of Korshunov's 
result leads to an estimate of the number of such automata involving the Stirling number 
of the second kind. On the other side, in [21] a different simplification of the involved 
expressions is achieved, by highlighting the role of the Lagrange Inversion Formula in the 
analysis. 
A natural question is to ask which is the fraction of minimal automata, among accessible
complete and deterministic automata of a given size n and alphabet cardinality k. Nicaud
[26] shows that, asymptotically, half of the complete deterministic accessible automata over
a unary alphabet are minimal, thus solving the question for k = 1. Using REGAL, a
C++ library for the random generation of automata, the proportion of minimal automata
amongst complete deterministic accessible ones experimentally seems to be 85.32% for a
2-letter alphabet and more than 99.99% for a larger alphabet [2].

In this paper we solve this question for a generic integer $k \ge 2$. At a slightly higher level
of generality, we give a precise estimation of the asymptotic proportion of minimal automata,
within n-state accessible deterministic complete automata over a k-letter alphabet, for the
uniform distribution over the possible transition structures, and a binomial distribution
over terminal states, with arbitrary parameter 0 < b < 1 (the uniform case corresponding
to b = 1/2). Our theoretical results are in agreement with the experimental ones.

The paper is organized as follows. In Section 2 we recall some basic notions of automata
theory, and we fix a list of notations that will be used in the remainder of the paper. Then,
we state our main theorem, and give a short and simple heuristic argument. In Section 3 we
give a detailed description of the proof structure, and its subdivision into separate lemmas.
In Section 4 we prove in detail the most difficult lemmas, and give indications for those
that are provable through standard methods. Finally, in Section 5 we discuss some of the
implications of our result.

2. Statement of the result 

For a given set E, |E| denotes the cardinality of E. The symbol [n] denotes the canonical
n-element set {1, 2, ..., n}. Let $\mathcal{E}$ be a Boolean condition; the Iverson bracket
$[\mathcal{E}]$ is equal to 1 if $\mathcal{E}$ = true and 0 otherwise. We use E(X) to denote the
expectation of the quantity X, and $P(\mathcal{E}) = E([\mathcal{E}])$ for the probability of the
event $\mathcal{E}$. For $\{\mathcal{E}_i\}$ a collection of events, we define a shortcut for the
first moment

$$m(\{\mathcal{E}_i\}) := \sum_i P(\mathcal{E}_i) = E\Big(\sum_i [\mathcal{E}_i]\Big). \qquad (2.1)$$

If p(c) is the probability that exactly c events occur, we have
$m(\{\mathcal{E}_i\}) = \sum_c c\,p(c) \ge \sum_{c \ge 1} p(c) = 1 - p(0)$, i.e.
$p(0) \ge 1 - m(\{\mathcal{E}_i\})$. This elementary inequality, known as
first-moment bound, is used repeatedly in the following.

A finite deterministic automaton A is a quintuple $A = (\Sigma, Q, \delta, q_0, T)$ where Q is a
finite set of states, $\Sigma$ is a finite set of letters called alphabet, the transition function
$\delta$ is a mapping from $Q \times \Sigma$ to Q, $q_0 \in Q$ is the initial state and
$T \subseteq Q$ is the set of terminal (or final) states. With abuse of notation, we identify
$T(i) = [i \in T]$.

An automaton is complete when its transition function is total. The transition function
can be extended by morphism to all words of $\Sigma^*$: $\delta(p, \varepsilon) = p$ for any
$p \in Q$ and, for any $u, v \in \Sigma^*$, $\delta(p, uv) = \delta(\delta(p,u), v)$. A word
$u \in \Sigma^*$ is recognized by an automaton when $\delta(q_0, u) \in T$. The language
recognized by an automaton is the set of words that it recognizes. An automaton is accessible
when for any state $p \in Q$, there exists a word $u \in \Sigma^*$ such that $\delta(q_0, u) = p$.
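The definitions above can be illustrated by a minimal Python sketch. The representation is an assumption made only for this example: states and letters are 0-indexed integers, the initial state is passed explicitly, and `delta[p][a]` stores $\delta(p, a)$; the function names are hypothetical.

```python
from collections import deque

def run(delta, p, word):
    """Extended transition function: follow the letters of `word` from state p."""
    for a in word:
        p = delta[p][a]
    return p

def recognizes(delta, q0, terminal, word):
    """A word is recognized when it leads from q0 to a terminal state."""
    return run(delta, q0, word) in terminal

def accessible_states(delta, q0):
    """Breadth-first search of the states reachable from q0."""
    seen = {q0}
    queue = deque([q0])
    while queue:
        p = queue.popleft()
        for q in delta[p]:
            if q not in seen:
                seen.add(q)
                queue.append(q)
    return seen

def is_accessible(delta, q0):
    """The automaton is accessible when every state is reachable from q0."""
    return len(accessible_states(delta, q0)) == len(delta)

# Toy complete automaton with 3 states over a 2-letter alphabet.
delta = [[1, 0], [2, 0], [2, 2]]
terminal = {2}
```

Here `recognizes(delta, 0, terminal, [0, 0])` follows the path 0 → 1 → 2 and accepts, while the word `[0, 1]` returns to state 0 and is rejected.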

  k            2         3         4         5         6
  $\omega_k$   0.796812  0.940480  0.980173  0.993023  0.997484
  $c_k$        0.317455  0.415928  0.461509  0.482799  0.492498

Table 1: The constants involved in the statement of Theorem 2.1 for the first values of k.

We say that two states p, q are Myhill-Nerode-equivalent (or just equivalent), and write
$p \sim q$, if, for all finite words u, $T(\delta(p,u)) = T(\delta(q,u))$ [25]. This property is
easily seen to be an equivalence relation. An automaton is said to be minimal if all the
equivalence classes are atomic, i.e. $p \not\sim q$ for all $p \ne q$. Otherwise, the minimal
automaton A' recognising the same language as A has set of states Q' corresponding to the set
of equivalence classes of A. This automaton can be determined through a fast and simple
algorithm, due to Hopcroft and Ullman. For this and other results on automata see e.g. [15], [28].
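To make the minimality criterion concrete, the following Python sketch computes the Myhill-Nerode classes by iterated partition refinement (a Moore-style refinement, chosen here for brevity; it is not claimed to be the algorithm referred to in the text). It assumes 0-indexed states, `delta[p][a]` for the transition function, and hypothetical function names. Accessibility is not checked here.

```python
def equivalence_classes(delta, terminal):
    """Return a list cls with cls[p] == cls[q] iff p and q are equivalent."""
    n = len(delta)
    # Initial partition: terminal versus non-terminal states.
    cls = [1 if p in terminal else 0 for p in range(n)]
    while True:
        # Two states stay together only if they are in the same class
        # and their successors, letter by letter, are in the same classes.
        sig = [(cls[p],) + tuple(cls[q] for q in delta[p]) for p in range(n)]
        ids = {}
        new = [ids.setdefault(s, len(ids)) for s in sig]
        if len(ids) == len(set(cls)):
            return new  # no class was split: the partition is stable
        cls = new

def is_minimal(delta, terminal):
    """All equivalence classes are atomic (accessibility is assumed)."""
    return len(set(equivalence_classes(delta, terminal))) == len(delta)
```

For instance, with `delta = [[2, 2], [2, 2], [0, 1]]` the states 0 and 1 have identical outgoing transitions, so `is_minimal(delta, {2})` is false, whereas `is_minimal(delta, {0})` is true because the two states are separated by the empty word.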

For the purpose of enumeration, the actual labeling of states in Q and letters in $\Sigma$ is
inessential, and we can canonically assume that Q = [n], $\Sigma$ = [k], and $q_0 = 1$. In this
case, when there is no ambiguity on the values of n and k, we will associate an automaton A to
a pair $(\delta, T)$, of transition function and set of terminal states. The set of complete
deterministic accessible automata with n states over a k-letter alphabet is denoted
$\mathcal{A}_{n,k}$.

We will determine statistical averages of quantities associated to automata
$A \in \mathcal{A}_{n,k}$. This requires the definition of a measure $\mu(A)$ over
$\mathcal{A}_{n,k}$. The simplest and most natural case is just the uniform measure. We
generalise this measure by introducing a continuous parameter. For S a finite set, the
multi-dimensional Bernoulli distribution of parameter b over subsets $S' \subseteq S$ is defined
as $\mu_b(S') = b^{|S'|} (1-b)^{|S|-|S'|}$. The distribution associated to the quantity |S'| is
thus the binomial distribution. We will consider the family of measures
$\mu_b^{(n,k)}(A) = \mu_{\mathrm{unif}}^{(n,k)}(\delta)\, \mu_b^{(n)}(T)$, with
$\mu_{\mathrm{unif}}^{(n,k)}(\delta)$ the uniform measure over the transition structures of
appropriate size, and $\mu_b^{(n)}(T)$ the Bernoulli measure of parameter b over Q = [n]. The
uniform measure over all accessible deterministic complete automata is recovered setting
b = 1/2. Superscripts will be omitted when clear.
The result we aim to prove in this paper is

Theorem 2.1. In the set $\mathcal{A}_{n,k}$, with the uniform measure, the asymptotic fraction of
minimal automata is

$$\exp\big(-\tfrac{1}{2}\, c_k\, n^{-k+2}\big), \qquad (2.2)$$

with

$$c_k = \tfrac{1}{2}\, \omega_k^k; \qquad -k\,\omega_k = \ln(1 - \omega_k). \qquad (2.3)$$

More generally, for any 0 < b < 1, with measure $\mu_b^{(n,k)}(A)$, the asymptotic fraction is

$$\exp\big(-(1 - 2b(1-b))\, c_k\, n^{-k+2}\big). \qquad (2.4)$$

We singled out the constant $\omega_k$, instead of only $c_k$, because the former appears
repeatedly in the evaluation of several statistical properties of random automata. Solving
(2.3), it can be written in terms of (a branch of) the Lambert W-function, as
$\omega_k = 1 + \frac{1}{k} W(-k e^{-k})$; however, the implicit definition (2.3) is of more
practical use. See Table 1 for a numerical table of values.
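As a numerical illustration, the constants of Table 1 and the asymptotic fraction of the theorem can be evaluated with a few lines of Python. This is only a sketch: it solves the implicit equation for the non-zero root by plain bisection on $x = 1 - \omega_k$ (our choice of method), and the function names `omega`, `c` and `minimal_fraction` are hypothetical.

```python
import math

def omega(k, iters=200):
    """Non-zero root of -k*w = ln(1 - w), found by bisection on x = 1 - w.

    h(x) = ln(x) + k*(1 - x) is increasing on (0, 1/k), negative near 0
    and positive at x = 1/k when k > 1, so the root is bracketed.
    """
    lo, hi = 1e-300, 1.0 / k
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.log(mid) + k * (1.0 - mid) < 0:
            lo = mid
        else:
            hi = mid
    return 1.0 - 0.5 * (lo + hi)

def c(k):
    """The constant c_k = omega_k^k / 2."""
    return 0.5 * omega(k) ** k

def minimal_fraction(k, n, b=0.5):
    """Asymptotic fraction exp(-(1 - 2b(1-b)) c_k n^(2-k)) of minimal automata."""
    return math.exp(-(1.0 - 2.0 * b * (1.0 - b)) * c(k) * n ** (2 - k))
```

For k = 2 the exponent does not depend on n, and `minimal_fraction(2, n)` evaluates to about 0.8532, in line with the experimental proportion quoted in the introduction.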

When it is understood that $|\Sigma| = k$, a transition function $\delta$ is identified with a
k-tuple of maps (or, for short, a k-map) $\delta_a : Q \to Q$, as $\delta_a(p) = \delta(p, a)$ (in
this case, to avoid confusion, we use $\Delta$ for the k-tuple $\{\delta_a\}_{1 \le a \le k}$).
And, clearly, a k-map is identified with the corresponding vertex-labeled, edge-coloured digraph
over n vertices, with uniform out-degree k, such that, for each vertex $i \in [n]$ and each colour
$a \in [k]$, there exists exactly one edge of colour a outgoing from i. Graph-theoretic
terminology will occasionally be used in the following.

Figure 1: Left: the M-motif. Right: the three-state M-motif. The examples are for k = 3.

We use the word motif for an unlabeled oriented graph M, when it is intended as
denoting the class of subgraphs of a k-map that are isomorphic to M. The core of our proof
is in the analysis of the probability of occurrence of certain motifs, that we now introduce.

Definition 2.2. An M-motif of a transition structure $\Delta$ is a pair of states $i \ne j$, and
an ordered k-tuple of states $\{\ell_a\}_{1 \le a \le k}$, such that
$\delta_a(i) = \delta_a(j) = \ell_a$ (see Figure 1, left). Repetitions among the $\ell_a$'s are
allowed.

A three-state M-motif of a transition structure $\Delta$ is the analogue of an M-motif,
with three distinct states i, j and h, such that
$\delta_a(i) = \delta_a(j) = \delta_a(h) = \ell_a$ for all $1 \le a \le k$ (see Figure 1, right).
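In the list-of-rows representation of a k-map, an M-motif is simply a pair of states with identical rows, which makes the motifs easy to count. The sketch below assumes 0-indexed states and `delta[p][a]` for $\delta_a(p)$; the helper names are hypothetical, and `random_k_map` samples an unconstrained k-map (no accessibility condition).

```python
import random
from collections import Counter
from math import comb

def count_motifs(delta):
    """Return (number of M-motifs, number of three-state M-motifs)."""
    # Group the states by their k-tuple of targets: states sharing a row
    # have the same image under every letter.
    rows = Counter(tuple(row) for row in delta)
    pairs = sum(comb(m, 2) for m in rows.values())
    triples = sum(comb(m, 3) for m in rows.values())
    return pairs, triples

def random_k_map(n, k, rng):
    """Uniform random k-map on n states (not conditioned to be accessible)."""
    return [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
```

For example, `count_motifs([[2, 2], [2, 2], [0, 1]])` finds one M-motif (states 0 and 1) and no three-state M-motif.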

The reason for studying M-motifs is in the two following easy remarks:

Remark 2.3. If the transition structure of an automaton A contains an M-motif, with states
i, j and $\{\ell_a\}$, and T(i) = T(j), then $i \sim j$ and A is not minimal.

Remark 2.4. Consider a transition structure $\Delta$ containing no three-state M-motifs, and
r M-motifs with states $\{i_\alpha, j_\alpha, \{\ell_a^\alpha\}\}_{1 \le \alpha \le r}$. Averaging
over the possible sets of terminal states with the measure $\mu_b(T)$, the probability that
$T(i_\alpha) = T(j_\alpha)$ for some $1 \le \alpha \le r$ is $1 - (2b(1-b))^r$. Indeed, in the
absence of three-state M-motifs the r pairs are disjoint, and each pair independently has
$T(i_\alpha) \ne T(j_\alpha)$ with probability 2b(1-b).
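The probability in Remark 2.4 can be checked by exhaustive enumeration for small r. The following Python sketch assumes that the r pairs are disjoint (as is the case when there are no three-state M-motifs) and places the pairs on states (0,1), (2,3), ...; the function name is hypothetical.

```python
from itertools import product

def prob_some_pair_agrees(r, b):
    """Exact probability, under the Bernoulli measure of parameter b,
    that at least one of r disjoint pairs has T(i) == T(j)."""
    total = 0.0
    for bits in product((0, 1), repeat=2 * r):
        # Weight of this choice of terminal / non-terminal states.
        weight = 1.0
        for t in bits:
            weight *= b if t else (1.0 - b)
        if any(bits[2 * a] == bits[2 * a + 1] for a in range(r)):
            total += weight
    return total
```

The value returned agrees with the closed form $1 - (2b(1-b))^r$ up to floating-point rounding, for instance for r = 3 and b = 0.3.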

Our theorem results as a consequence of a number of statistical facts, on the structure of 
random automata, which are easy to believe although hard to prove. Thus, there is a short, 
non-rigorous path leading to the theorem, that we now explain. 

(1) A fraction 1 - o(1) of non-minimal automata contains two Myhill-Nerode-equivalent
states $i \sim j$, which are the incoming states of an M-motif.

(2) Random transition structures locally "look like" random k-maps - this despite the
highly non-local, and non-trivial, accessibility condition - the only remarkable
difference being in the distribution of the incoming degrees r of the states, $p_r = 0$ if
r = 0, and $p_r = \frac{1}{\omega_k} \mathrm{Poiss}_{k\omega_k}(r)$ if $r \ge 1$.

(3) With this in mind, it is easy to calculate that the average number of M-motifs with
equivalent incoming states is
$(1 - 2b(1-b)) \binom{n}{2} n^{-k} \big[ E\big( \tfrac{r(r-1)}{k^2} \big) \big]^k$ at leading
order in n, that is, $\frac{1}{2} (1 - 2b(1-b))\, \omega_k^k\, n^{-k+2}$.

(4) Random transition structures also show weak correlations between distant parts,
and M-motifs are 'small', thus, with high probability, pairs of M-motifs are
non-overlapping. This suggests that the distribution of the number of M-motifs is
Poissonian, with the average calculated above (as if the corresponding events were
decorrelated). As a corollary, we get the probability that there are no M-motifs. By
the first claim, on the dominant role of M-motifs, this allows us to conclude.



3. Structure of the proof 

As often happens, what seems the easiest way to get convinced of a claim is not necessarily
the easiest path to produce a rigorous proof. Our proof strategy will in fact be very different
from the sequence of claims collected above. As it is quite composite, in this section we will
outline the decomposition of the proof into lemmas, and postpone the proofs to Section 4.
Call $P_{\mathrm{rare}}$ the probability, w.r.t. $\mu_b(\Delta, T)$ above, that the transition
structure contains no M-motifs, and still the automaton is non-minimal. Call
$P_{\mathrm{confl}}$ the probability that the transition structure contains some three-state
M-motif. Call P(r) the probability that the transition structure contains no three-state
M-motifs, and exactly r M-motifs. Thus
$1 = P_{\mathrm{confl}} + \sum_{r \ge 0} P(r)$.

The fraction of pairs $(\Delta, T)$, of transition structures $\Delta$ with no three-state
M-motifs, and lists of terminal states T taken with the Bernoulli measure of parameter b, such
that $T(i_\alpha) = T(j_\alpha)$ for some M-motif, is $\sum_r P(r) (1 - (2b(1-b))^r)$. As a
consequence, w.r.t. the measure $\mu_b(A)$ above, the probability that an automaton is
non-minimal is

$$\mathrm{prob}(A \text{ is non-minimal}) = \sum_r P(r) \big(1 - (2b(1-b))^r\big) + O(P_{\mathrm{rare}}) + O(P_{\mathrm{confl}}). \qquad (3.1)$$

If one can prove that $P_{\mathrm{rare}}, P_{\mathrm{confl}} = o(1 - P(0))$, then

$$\mathrm{prob}(A \text{ is non-minimal}) = \sum_{r \ge 1} P(r) \big(1 - (2b(1-b))^r + o(1)\big). \qquad (3.2)$$

In particular, if we can prove that $P(r) = \mathrm{Poiss}_\rho(r)(1 + o(1))$, with
$\rho = \sum_r r P(r)$, it would follow that

$$\mathrm{prob}(A \text{ is non-minimal}) = \big(1 - e^{-\rho (1 - 2b(1-b))}\big)(1 + o(1)). \qquad (3.3)$$

This corresponds to the statement of Theorem 2.1, with $\rho = c_k n^{-k+2}$.

Note that our error term is not only small w.r.t. 1, but also, as is important for
probabilities, it is small w.r.t. min(p, 1 - p), with p the probability of our event of
interest. As, for an alphabet with k letters, $p \sim n^{-k+2}$ has a non-trivial scaling with
size when k > 2, this difference is relevant.

So we see that Theorem 2.1 is implied by

Proposition 3.1. The statements in the following list do hold:

(1) $P(r) = \mathrm{Poiss}_\rho(r)(1 + o(1))$, for some $\rho$;

(2) $\rho = c_k n^{-k+2} (1 + o(1))$;

(3) $P_{\mathrm{confl}} = o(n^{-k+2})$;

(4) $P_{\mathrm{rare}} = o(n^{-k+2})$.

This is the theorem we will ultimately prove.

A collection of related, more explicit probabilistic statements is the following 
Proposition 3.2. For M-motifs $\mathcal{M}$, and three-state M-motifs $\mathcal{M}^{(3)}$, the
average number of occurrences in uniform random transition structures is given by

$$m[\mathcal{M}] = \tfrac{1}{2}\, n^{-k+2}\, \omega_k^{k}\, (1 + o(1)), \qquad (3.4)$$

$$m[\mathcal{M}^{(3)}] = \tfrac{1}{6}\, n^{-2k+3}\, \omega_k^{2k}\, (1 + o(1)). \qquad (3.5)$$

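Given that there are no three-state M-motifs, the average number of r-tuples
$(\mathcal{M}_1, \ldots, \mathcal{M}_r)$ of distinct M-motifs is given by

$$m[(\mathcal{M}_1, \ldots, \mathcal{M}_r)] = \frac{1}{r!} \Big( \tfrac{1}{2}\, n^{-k+2}\, \omega_k^{k} \Big)^r (1 + o(1)). \qquad (3.6)$$
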
The proof of this proposition is postponed to Section 4.

Equation (3.4) proves $\rho = c_k n^{-k+2} (1 + o(1))$, that is, Part 2 of Proposition 3.1. Using
the first-moment bound, equation (3.5) proves $P_{\mathrm{confl}} = O(n^{-k+1})$, as required for
Part 3 of Proposition 3.1.

The result in (3.6), concerning higher moments of M-motifs, implies the proof of convergence
of P(r) to a Poissonian, Part 1 of Proposition 3.1. The idea behind this claim is the
fact that the occurrence of an M-motif with given states {i, j} (and any k-tuple
$\{\ell_a\}$) is a 'rare' event, as it has a probability $\sim n^{-k}$, and, as the motifs are
'small' subgraphs, involving O(1) vertices, and parts of the transition structure $\Delta$ far
away from each other (in the sense of distance on the graph) are weakly correlated, we expect
the "Poisson Paradigm" to apply in this case, as discussed, for example, in Alon and Spencer
[1, ch. 8]. A rigorous proof of this phenomenon can be achieved using the strategy called
Brun's sieve (see e.g. [1, sec. 8.3]). The verification of the hypotheses discussed in the
mentioned reference is exactly the statement of equation (3.6).

Thus, assuming Proposition 3.2, there is a single missing item in our 'checklist', namely,
Part 4 of Proposition 3.1. We need to determine that $P_{\mathrm{rare}} = o(n^{-k+2})$. The idea
behind this is that, in absence of M-motifs, with probability $1 - o(n^{-k+2})$, for all pairs of
states (i, j), the simultaneous breadth-first search trees started from i and j visit almost
surely a large number of distinct states (for our proof, a number of order $\ln n$ would
suffice, but it will turn out to be provably at least $\sim n^{1/(4(k+1))}$ and in fact
conjecturally $\Theta(n)$). Thus, as, for all the pairs of homologous but distinct states, the
states need to be either both or none terminal states, this produces a factor 1 - 2b(1-b)
per pair.

Note that we need only an upper bound on $P_{\mathrm{rare}}$ (and no lower bound), and we have
some freedom in producing bounds, as, at a heuristic level, we expect
$P_{\mathrm{rare}} = o(n^{-k+2})$. Our proof strategy will exploit this fact, and the following
property of accessible transition functions (see [7]): given a random k-map
$\Delta = \{\delta_a(i)\}_{1 \le i \le n,\, 1 \le a \le k}$, the number of states accessible from
state 1 is a random variable m = m(n, k), with average $\Theta(n)$ and probability around the
modal value (i.e., the most probable value) of order $n^{-1/2}$. Remarkably, given that the
accessible part has size m, the induced transition structure is sampled uniformly among all
transition structures of size m.

This has a direct simple consequence: if the average number of occurrences of a family of
events on a random k-map is $m[\{\mathcal{E}_i\}]_{k\text{-maps}} = O(n^{-\gamma})$, then the same
average over random accessible transition functions of fixed size is bounded as
$m[\{\mathcal{E}_i\}]_{\mathrm{acc.}} \le O(n^{-\gamma + 1/2})$. Actually, this bound is very
generous and, if needed (but this is not our case), the extra exponent 1/2 could be reduced
significantly with some extra effort.

Thus, instead of proving that $P_{\mathrm{rare}} = o(n^{-k+2})$, we will define the quantity
$P'_{\mathrm{rare}}$, exactly as $P_{\mathrm{rare}}$ but on random k-maps over n states. Note that
the definition of $P_{\mathrm{rare}}$ and $P'_{\mathrm{rare}}$ is based on two notions, that of
containing certain motifs and that of presenting pairs of Myhill-Nerode-equivalent states, and
that both these notions are not confined to accessible automata, but are well-defined also for
maps which are not accessible. Then we will prove that

Proposition 3.3. $P'_{\mathrm{rare}} = o(n^{-k+3/2})$.

In summary, as this proposition implies Part 4 of Proposition 3.1, Proposition 3.2
implies Parts 1 to 3 of Proposition 3.1, and Proposition 3.1 implies our main Theorem
2.1, providing proofs of Propositions 3.2 and 3.3 is sufficient for our purposes. This task is
fulfilled in the following section.

4. Proofs of the lemmas 

Proof of Proposition 3.3. In a k-map, we say that a state i is a sink state if
$\delta_a(i) = i$ for all a. We say that two states {i, j} form a sink pair if the set

$$N_{ij} := \{i, j\} \cup \{\delta_a(i), \delta_a(j)\}_{1 \le a \le k}$$

has cardinality k + 1 or smaller. As easily seen through first-moment bound, the probability
of having any sink state or sink pair in a random k-map is at most of order $n^{-k+1}$ (with an
overall constant that depends only on k). So, with the aim of proving that
$P'_{\mathrm{rare}} = o(n^{-k+3/2})$, we can equivalently condition the k-map not to contain any
sink motif.

We say that two states {i, j} form a quasi-sink pair if the set $N_{ij}$ has cardinality k + 2.
The average number of quasi-sink pairs in a random k-map is of order $n^{-k+2}$, thus this case
must be analysed at our level of accuracy.

There exist three families of quasi-sink pairs: those producing an M-motif, those such
that there exists a value a such that $\{i, j, \delta_a(i), \delta_a(j)\}$ are all distinct
(type-1), and those such that for h letters of the alphabet $\delta_a(i)$ is uniquely realized
in $N_{ij}$, and for the remaining k - h letters $\delta_a(j)$ is uniquely realized in $N_{ij}$
(type-2). In evaluating $P'_{\mathrm{rare}}$, we have excluded the M-motif case, and we are left
only with type-1 and type-2 quasi-sinks. Furthermore, we have excluded sink states, so in type-2
quasi-sinks we must have both h and k - h non-zero.

For a type-1 quasi-sink {i, j}, define the pair following {i, j} as the pair {i', j'} such
that $i' = \delta_a(i)$, $j' = \delta_a(j)$, for a the first lexicographic letter such that
$\{i, j, \delta_a(i), \delta_a(j)\}$ are all distinct. For a type-2 quasi-sink {i, j} define the
pair following {i, j} as the pair {i', j'} with $i' = \delta_1(i)$, $j' = \delta_1(j)$. Again, by
first-moment estimate, the probability that there exists a quasi-sink pair {i, j}, such that
also the pair following it is a quasi-sink, is bounded by $O(n^{-k+1})$ (use for this purpose
that h(k - h) > 0 in a type-2 quasi-sink), and we can condition our k-map not to contain such
motifs. If {i, j} is a quasi-sink pair, a necessary condition for $i \sim j$ is that also
$i' \sim j'$. Thus, we can bound $P'_{\mathrm{rare}}$ by the probability that there exists a pair
of equivalent states which is not a quasi-sink pair. This is the formulation of the problem
that we ultimately address.

Consider a non-quasi-sink pair {i, j}, and construct the lexicographic breadth-first tree
exploration, simultaneously on the two states i and j, neglecting those branches in which,
in one or both of the two trees, there is a state already visited by the exploration (call
these nodes leaves).

Call $(v_1, v_2, \ldots)$ the ordered sequence of steps in the breadth-first search at which a
leaf node is visited. For fixed values v and h, we want to determine the probability of the
event $v_h < v$, conditioned to the event that the list has at least h items. By standard
estimate of factorials, and crucially making use of the exclusion of sink and quasi-sink
motifs, it can be proved for this quantity

$$\mathrm{prob}(v_h < v) \le \frac{1}{h!} \Big( \frac{v(v+1)}{n - 2v} \Big)^h. \qquad (4.1)$$

Set now h = k + 1. By definition, in a non-quasi-sink pair, we certainly have at least k + 1
entries $v_j$. If $v = O(n^\gamma)$ for some $0 < \gamma < 1$, we have that for each
non-quasi-sink pair {i, j}

$$\mathrm{prob}(v_{k+1} < v) \le O\big(n^{-(k+1)(1 - 2\gamma)}\big). \qquad (4.2)$$

The number of non-quasi-sink pairs is bounded by $\binom{n}{2}$, thus by first-moment bound

$$\mathrm{prob}(v_{k+1} < v \text{ for some } \{i,j\}) \le O\big(n^{-k+1+2\gamma(k+1)}\big). \qquad (4.3)$$



For $\gamma < \frac{1}{4(k+1)}$ we thus get
$\mathrm{prob}(v_{k+1} < v \text{ for some } \{i,j\}) \le o(n^{-k+3/2})$, as needed. Thus, we
know that, with probability larger than $1 - o(n^{-k+3/2})$, all the non-quasi-sink pairs in our
k-map have $v_{k+1} > n^\gamma$, for any $\gamma < \frac{1}{4(k+1)}$. This means that, if we
truncate the breadth-first search tree exploration after $\sim n^\gamma$ steps, we have at most
k leaves in the tree. Thus, for all the trees, we have at least $\sim n^\gamma$ internal nodes,
i.e. pairs of states $(i', j') = (\delta(i,u), \delta(j,u))$, for which it is required
T(i') = T(j') for $i \sim j$. But, as all these states appear not repeated in the exploration,
the probability that $i \sim j$ is bounded by an exponential of the form
$(1 - 2b(1-b))^{n^\gamma}$, which decreases faster than any power law. The overall factor
$\binom{n}{2}$ from the first-moment bound is irrelevant, and we are able to conclude that
$P'_{\mathrm{rare}} = o(n^{-k+3/2})$, as needed. Note that this proof works not only for finite
values of b in the open interval ]0, 1[ (as required for our purposes), but even up to
$b \sim n^{-\gamma}$. ■

Before passing to the proof of Proposition 3.2, we need to recall the relation between
accessible deterministic complete automata and combinatorial objects known as k-Dyck
tableaux [3], and determine a collection of statistical properties of these tableaux.

Given the integers M and n, a tableau T in the set $\mathcal{T}[M \times n]$ is a map from [M]
to [n] such that:

(1) every value $y \in [n]$ has at least one preimage;

(2) calling $x_T(y)$ the smallest preimage, we have $x_T(1) < x_T(2) < \cdots < x_T(n)$.

The tableau T may be represented graphically, on an M x n grid, by marking the M pairs
$\{(x, T(x))\}_{1 \le x \le M}$. Then the conditions above translate as follows. There is exactly
one marked entry per column. Mark in red the pairs $(x_T(y), y)$, and in black the remaining
ones: there is exactly one red entry per row, which is at the left of all black entries in the
same row (if any), and the polygonal line connecting the red entries in sequence is
monotonically increasing. We call the collections of positions of red and black marks
respectively the backbone $B_T$ and wiring part $W_T$ of the tableau T. It is easily seen that
the number of tableaux in $\mathcal{T}[M \times n]$ is given by the Stirling number of the second
kind, $\left\{ {M \atop n} \right\}$, i.e., the number of ways of partitioning M elements into n
non-empty blocks (see e.g. [12, sec. 6.1, 7.4]).
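The counting claim can be verified by brute force for small sizes. The sketch below enumerates all maps from [M] to [n], keeps those satisfying the two conditions of the definition, and compares the count with the Stirling number obtained from the standard recurrence $S(M,n) = n\,S(M-1,n) + S(M-1,n-1)$. The function names are hypothetical and the enumeration is only feasible for tiny M and n.

```python
from itertools import product

def stirling2(M, n):
    """Stirling number of the second kind via the standard recurrence."""
    table = [[0] * (n + 1) for _ in range(M + 1)]
    table[0][0] = 1
    for m in range(1, M + 1):
        for j in range(1, min(m, n) + 1):
            table[m][j] = j * table[m - 1][j] + table[m - 1][j - 1]
    return table[M][n]

def is_tableau(T, n):
    """T is a tuple of values in 1..n, one per column x = 1..M."""
    first = {}
    for x, y in enumerate(T, start=1):
        first.setdefault(y, x)  # smallest preimage of each value
    if len(first) != n:         # condition (1): every value is attained
        return False
    # Condition (2): smallest preimages are increasing with the value.
    return all(first[y] < first[y + 1] for y in range(1, n))

def count_tableaux(M, n):
    return sum(is_tableau(T, n) for T in product(range(1, n + 1), repeat=M))
```

For example, `count_tableaux(5, 3)` and `stirling2(5, 3)` both give 25.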
Figure 2: Left: a tableau with n = 9 and k = 3. The backbone part is in red. This tableau
is valid because the red entries are monotonic (as shown by the orange profile),
and k-Dyck because they are all on the left of the green staircase line. Right:
the associated k-map. Backbone edges, corresponding to the breadth-first search
tree, are thick, and wiring edges are in gray.

The asymptotic evaluation of $\left\{ {M \atop n} \right\}$, for n large and M/n = O(1), can be
done through the general methods of analytic combinatorics (see e.g. [10], and in particular
[11] for this specific problem). A result of this calculation that we shall need is the following

Proposition 4.1. If $M, M' = \kappa n + O(1)$, with $\kappa > 1$, calling $\omega$ the only
solution of the equation $-\kappa\omega = \ln(1 - \omega)$ in ]0, 1[,

$$\frac{\left\{ {M \atop n} \right\}}{\left\{ {M' \atop n} \right\}} = \Big( \frac{n}{\omega} \Big)^{M - M'} (1 + o(1)). \qquad (4.4)$$

For a fixed integer k, when M = N(n, k) = kn + 1, we have a special subfamily of tableaux
in $\mathcal{T}[N \times n]$. A tableau is k-Dyck if $x_T(\ell) \le k(\ell - 1) + 1$, i.e. if the
backbone cells lie above the line of slope 1/k containing the origin of the grid. A small
example of k-Dyck tableau is shown in Figure 2.

There exists a canonical bijection between k-Dyck tableaux and transition structures $\Delta$
of accessible deterministic complete automata. It suffices to associate the indices
(1, 2, ..., n) of the states to the rows of the tableaux, and the indices
$(\varepsilon, 1_1, \ldots, 1_k, \ldots, n_1, \ldots, n_k)$ of the oriented edges of $\Delta$ to
the columns. Then, for $x = i_a$, the entry (x, y) is marked in T if and only if
$\delta_a(i) = y$, and it is part of the backbone if and only if it is part of the
breadth-first search tree on $\Delta$ started at the initial state.
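The bijection can be tested exhaustively for very small n and k. The Python sketch below reads a k-map (1-indexed states, rows listing the k targets) as a column sequence, checks the tableau and k-Dyck conditions, and compares the number of k-Dyck tableaux with the number of accessible k-maps with initial state 1. The comparison relies on an assumption spelled out here: each accessible structure admits (n-1)! distinct labelings of the non-initial states, exactly one of which follows the breadth-first order. All function names are hypothetical.

```python
from itertools import product
from math import factorial

def tableau_of(delta):
    """Columns (epsilon, 1_1..1_k, ..., n_1..n_k) mapped to target states."""
    return [1] + [q for row in delta for q in row]

def is_k_dyck(T, n, k):
    first = {}
    for x, y in enumerate(T, start=1):
        first.setdefault(y, x)
    if len(first) != n:
        return False
    increasing = all(first[y] < first[y + 1] for y in range(1, n))
    dyck = all(first[y] <= k * (y - 1) + 1 for y in range(1, n + 1))
    return increasing and dyck

def is_accessible(delta):
    seen, stack = {1}, [1]
    while stack:
        p = stack.pop()
        for q in delta[p - 1]:
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return len(seen) == len(delta)

def counts(n, k):
    """(number of k-Dyck tableaux, number of accessible labeled k-maps)."""
    dyck = acc = 0
    for flat in product(range(1, n + 1), repeat=n * k):
        delta = [flat[i * k:(i + 1) * k] for i in range(n)]
        dyck += is_k_dyck(tableau_of(delta), n, k)
        acc += is_accessible(delta)
    return dyck, acc
```

For n = 2 and k = 2 both counts equal 12, and for n = 3 and k = 2 the number of accessible labeled structures is twice the number of k-Dyck tableaux, as expected from the (n-1)! relabelings.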

Given a function $f(y) : [n] \to [M]$, consider the restriction of the set
$\mathcal{T}[M \times n]$ to tableaux T in which the backbone function $x_T(y)$ is dominated by
f, i.e., such that $x_T(y) \le f(y)$ for all $1 \le y \le n$. Call $\mathcal{T}[M \times n; f]$
this set. Our k-Dyck tableaux correspond to the special case $\mathcal{T}[N \times n; f^{0}]$,
with $f^{0}(y) := N - k(n - y + 1)$. A required technical lemma, that we state without proof, is
the following

Proposition 4.2. Take an integer n, N = O(n), B = O(1), and $\ell \gg \sqrt{n}$. Let
M = N - B, and take a function f such that $f(y) = f^{0}(y)$ for all $y < \ell$,
$f(y) = f^{0}(y) - B$ for all $y > n - \ell$, and $f^{0}(y) - B \le f(y) \le f^{0}(y)$ for all y.
Then

$$\frac{|\mathcal{T}[M \times n; f]|}{|\mathcal{T}[M \times n]|} = \frac{|\mathcal{T}[N \times n; f^{0}]|}{|\mathcal{T}[N \times n]|}\, (1 + o(1)). \qquad (4.5)$$
With these tools at hand, we are now ready to prove Proposition 3.2.

Proof of Proposition 3.2. Given three distinct states i, j, h, with i < j < h, call
$\mathcal{M}_{ijh}(T)$ the event that in the tableau T there is a three-state motif on states
{i, j, h} and $\{\ell_a\}$, for some $\ell_a$'s. Similarly, given 2r distinct states
$\{(i_\alpha, j_\alpha)\}_{1 \le \alpha \le r}$, with $i_\alpha < j_\alpha$ and
$j_\alpha < j_{\alpha+1}$, call $\mathcal{M}_{(i_1 j_1; \ldots; i_r j_r)}(T)$ the event that in
the tableau T there is an r-tuple of M-motifs, such that the $\alpha$-th motif has states
$i_\alpha$, $j_\alpha$, and $\{\ell_a^\alpha\}$, for some $\ell_a^\alpha$'s. Proposition 3.2
consists in evaluating the two quantities

$$\sum_{i<j<h} E[\mathcal{M}_{ijh}]_{\mathcal{T}[N \times n; f^{0}]}; \qquad \sum_{(i_1,j_1;\ldots;i_r,j_r)} E[\mathcal{M}_{(i_1 j_1; \ldots; i_r j_r)}]_{\mathcal{T}[N \times n; f^{0}]}. \qquad (4.6)$$

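We now make a crucial remark: given a backbone structure B, the average over all possible
completions of the indicator variables $[\mathcal{M}_{ijh}]$ (respectively
$[\mathcal{M}_{(i_1 j_1; \ldots; i_r j_r)}]$) is zero if any column of index in the set
$C = \{k(j-1) + 1 + a,\ k(h-1) + 1 + a\}_{1 \le a \le k}$ has a red mark (respectively, in the
set $C = \{k(j_\alpha - 1) + 1 + a\}_{1 \le \alpha \le r;\, 1 \le a \le k}$); otherwise, it is
$\prod_{i \in C} y_i^{-1}$, where $y_i$ is the height of the backbone profile at column i. As a
consequence, backbone structures contributing to the quantities in (4.6), weighted with the
factor $\mu(B) \prod_{i \in C} y_i^{-1}$, correspond to generic backbone structures, weighted with
the factor $\mu(B)$, over (N - kr) x n tableaux. The correspondence is done by just erasing the
columns in C. The function f is modified accordingly. Define

$$f^{j_1, \ldots, j_r}(y) = f^{0}(y) - k \sum_{\alpha=1}^{r} [y > j_\alpha]. \qquad (4.7)$$

Then, the precise statement of the remark above is

$$E[\mathcal{M}_{ijh}]_{\mathcal{T}[N \times n; f^{0}]} = \frac{|\mathcal{T}[(N - 2k) \times n; f^{j,h}]|}{|\mathcal{T}[N \times n; f^{0}]|}; \qquad (4.8)$$

$$E[\mathcal{M}_{(i_1 j_1; \ldots; i_r j_r)}]_{\mathcal{T}[N \times n; f^{0}]} = \frac{|\mathcal{T}[(N - kr) \times n; f^{j_1, \ldots, j_r}]|}{|\mathcal{T}[N \times n; f^{0}]|}. \qquad (4.9)$$

Thus, the right-hand side of (4.8) is just the special case r = 2 of (4.9). Of course we have

$$\frac{|\mathcal{T}[(N - kr) \times n; f^{j_1, \ldots, j_r}]|}{|\mathcal{T}[N \times n; f^{0}]|} = \frac{|\mathcal{T}[(N - kr) \times n; f^{j_1, \ldots, j_r}]| \,/\, |\mathcal{T}[(N - kr) \times n]|}{|\mathcal{T}[N \times n; f^{0}]| \,/\, |\mathcal{T}[N \times n]|} \cdot \frac{|\mathcal{T}[(N - kr) \times n]|}{|\mathcal{T}[N \times n]|}. \qquad (4.10)$$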



We can apply Proposition 4.1 to the rightmost ratio. Then, if the $j_\alpha$'s are within the
range for application of Proposition 4.2, we can also simplify the leftmost ratio, to get

$$E[\mathcal{M}_{ijh}]_{\mathcal{T}[N \times n; f^{0}]} \sim \Big( \frac{\omega_k}{n} \Big)^{2k}; \qquad (4.11)$$

$$E[\mathcal{M}_{(i_1 j_1; \ldots; i_r j_r)}]_{\mathcal{T}[N \times n; f^{0}]} \sim \Big( \frac{\omega_k}{n} \Big)^{kr}. \qquad (4.12)$$

As in Proposition 4.2 we just asked for $\ell \gg \sqrt{n}$, which is compatible with
$\ell \ll n$, the fraction of 2r-tuples $(i_1, j_1; \ldots; i_r, j_r)$ such that some
$j_\alpha$'s are out of range is subleading, and, using the reasonings at the beginning of
Section 3, the corresponding contribution can be included in $P_{\mathrm{confl}}$.

Then, the straightforward calculation of the number of triplets (i, j, h), and 2r-tuples
$\{(i_\alpha, j_\alpha)\}_{1 \le \alpha \le r}$, at leading order in n, allows us to conclude. ■
5. Algorithmic consequences

The results obtained in this paper open new possibilities for the study in average of the 
properties of regular languages, and of the average-case complexity of algorithms applied 
to minimal automata. In this section we mention just a few among these consequences. 

Corollary 5.1. Minimal automata with n states over a k-letter alphabet can be randomly
generated with $O(n^{3/2})$ average complexity, using Boltzmann samplers.

The random generator for complete deterministic accessible automata given in [2] is
based on a Boltzmann sampler [9]; its average complexity is $O(n^{3/2})$. As, from Theorem 2.1,
there is a constant proportion of minimal automata amongst accessible ones, the rejection
method can be efficiently applied to randomly generate a minimal automaton. Note that
such a generator has already been implemented in [2], though there was no theoretical
result on the efficiency of this algorithm at that time.
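The rejection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the accessible automaton is drawn here by naive rejection on uniform k-maps, a stand-in that is practical only for small n and is not the Boltzmann sampler of [2]; terminal states are drawn with b = 1/2; states are 0-indexed with initial state 0; and all function names are hypothetical.

```python
import random

def accessible(delta):
    """All states are reachable from the initial state 0."""
    seen, stack = {0}, [0]
    while stack:
        p = stack.pop()
        for q in delta[p]:
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return len(seen) == len(delta)

def minimal(delta, terminal):
    """Partition refinement: minimal iff every class is a single state."""
    n = len(delta)
    cls = [int(p in terminal) for p in range(n)]
    while True:
        sig = [(cls[p],) + tuple(cls[q] for q in delta[p]) for p in range(n)]
        ids = {}
        new = [ids.setdefault(s, len(ids)) for s in sig]
        if len(ids) == len(set(cls)):
            return len(ids) == n
        cls = new

def sample_accessible(n, k, rng):
    """Naive stand-in for a Boltzmann sampler: retry until accessible."""
    while True:
        delta = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
        if accessible(delta):
            terminal = {p for p in range(n) if rng.random() < 0.5}
            return delta, terminal

def sample_minimal(n, k, rng):
    """Rejection method: redraw accessible automata until one is minimal."""
    tries = 0
    while True:
        tries += 1
        delta, terminal = sample_accessible(n, k, rng)
        if minimal(delta, terminal):
            return delta, terminal, tries
```

Since the proportion of minimal automata among accessible ones tends to a positive constant, the expected value of `tries` stays bounded as n grows; the cost of the sketch is dominated by the naive accessible sampler, which a Boltzmann sampler would replace.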

Corollary 5.2. For the uniform distribution on complete deterministic accessible automata,
the average complexity of Moore's state minimization algorithm is $\Theta(n \log\log n)$.

Proof. The average complexity of Moore's state minimization algorithm for the uniform
distribution on n-state deterministic automata over a finite alphabet is $O(n \log\log n)$ [8].
The upper bound for accessible automata is then obtained by studying the size of the accessible
part of a random k-map [3], [19]. Moreover, from [3], the lower bound of Moore's algorithm
applied on minimal automata with n states is $\Omega(n \log\log n)$. Using Theorem 2.1, this is
also a lower bound for complete deterministic accessible automata. ■

Corollary 5.3. For the uniform distribution on complete deterministic accessible automata,
there exists a family of implementations of Hopcroft's state minimization algorithm whose
average complexity is $O(n \log\log n)$.

From [8], a family of implementations of Hopcroft's state minimization algorithm is
always faster than Moore's algorithm. The result follows from Corollary 5.2. In [5] the
lower bound on the algorithm is proved to be $\Omega(n \log n)$ for any implementation. It is
still unknown, though, whether there exists an implementation whose average complexity is O(n).

References 

[1] N. Alon and J. Spencer. The Probabilistic Method. 2nd ed., John Wiley, 2000. 

[2] F. Bassino, J. David and C. Nicaud. REGAL: A library to randomly and exhaustively generate au- 
tomata. In J. Holub and J. Zdarek eds, 12th International Conference Implementation and Application 
of Automata (CIAA 2007), LNCS 4783, 303-305. Springer, 2007. 

[3] F. Bassino, J. David and C. Nicaud. Average-case analysis of Moore's state minimization algorithm. 
Algorithmica, to appear. Available at http://lipn.fr/~bassino/publications.html 

[4] F. Bassino and C. Nicaud. Enumeration and random generation of accessible automata. Theor. Comput.
Sci., 381 86-104, 2007.

[5] J. Berstel, L. Boasson and O. Carton. Continuant polynomials and worst-case behavior of Hopcroft's
minimization algorithm. Theor. Comput. Sci., 410 2811-2822, 2009.

[6] J.R. Büchi. Weak second-order arithmetic and finite automata. Math. Logic Quart., 6 66-92, 1960.

[7] A. Carayol and C. Nicaud. Distribution of the number of accessible states in a random deterministic
automaton. Submitted to STACS 2012.

[8] J. David. Average complexity of Moore's and Hopcroft's algorithms. Theor. Comput. Sci., to appear.
Available at http://www-lipn.univ-paris13.fr/~david/



¹ Available at http://regal.univ-mlv.fr/






[9] P. Duchon, P. Flajolet, G. Louchard and G. Schaeffer. Boltzmann Samplers for the Random Generation
of Combinatorial Structures. In Combinatorics, Probability, and Computing, Special issue on Analysis
of Algorithms 13 577-625, 2004.

[10] P. Flajolet and R. Sedgewick. Analytic Combinatorics. Cambridge Univ. Press, 2009.

[11] I.J. Good. An Asymptotic Formula for the Differences of the Powers at Zero. Ann. Math. Stat., 32
249-256, 1961.

[12] R.L. Graham, D.E. Knuth and O. Patashnik. Concrete Mathematics: A Foundation for Computer
Science. 2nd ed., Addison-Wesley, Reading, Mass., 1994.

[13] F. Harary. Unsolved problems in the enumeration of graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5
63-95, 1960.

[14] M.A. Harrison. A census of finite automata. Canad. Journ. of Math., 17 100-113, 1965.

[15] J.E. Hopcroft and J.D. Ullman. Introduction to Automata Theory, Languages and Computation.
Addison-Wesley, 1979.

[16] J.E. Hopcroft. An n log n algorithm for minimizing states in a finite automaton. Technical report,
Stanford CA, USA, 1971.

[17] R. Iranpour and P. Chacon. Basic Stochastic Processes: The Mark Kac Lectures. Macmillan Publ. Co.,
1988.

[18] S. Kleene. Representation of Events in Nerve Nets and Finite Automata. In C. Shannon and J. McCarthy
eds., Automata Studies, 3-42. Princeton University Press, 1956.

[19] A.D. Korshunov. Enumeration of finite automata. Problemy Kibernetiki, 34 5-82, 1978. In Russian.

[20] A.D. Korshunov. On the number of non-isomorphic strongly connected finite automata. Journal of
Information Processing and Cybernetics, 9 459-462, 1986.

[21] E. Lebensztayn. On the asymptotic enumeration of accessible automata. Discr. Math. Theor. Comp.
Science, 12 75-80, 2010.

[22] V.A. Liskovets. Enumeration of non-isomorphic strongly connected automata. Vesci Akad. Navuk BSSR,
Ser. Fiz.-Mat. Navuk, 3 26-30, 1971. In Russian.

[23] V.A. Liskovets. Exact enumeration of acyclic automata. In FPSAC'03, K. Eriksson, A. Bjorner and
S. Linusson eds. Available at http://www.i3s.unice.fr/fpsac/FPSAC03/ARTICLES/5.pdf

[24] E.F. Moore. Gedanken experiments on sequential machines. In Automata Studies, Princeton Univ.,
129-153, 1956.

[25] A. Nerode. Linear automaton transformations. Proc. of the American Math. Society, 9 541-544, 1958.

[26] C. Nicaud. Average state complexity of operations on unary automata. In 24th International Symposium
on Mathematical Foundations of Computer Science (MFCS 1999), 231-240, 1999.

[27] R. Robinson. Counting strongly connected finite automata. In Graph theory with Applications to
Algorithms and Computer Science, Y. Alavi et al. eds., Wiley, 671-685, 1985.

[28] J. Sakarovitch. Éléments de théorie des automates. Vuibert, 2003. English translation: Elements of
Automata Theory, Cambridge Univ. Press, to appear.

[29] S. Yu, Q. Zhuang and K. Salomaa. The state complexities of some basic operations on regular languages.
Theoret. Comput. Sci., 125 315-328, 1994.






Appendix A. Details of the proof of Proposition 3.3

We perform here a detailed derivation of equation (4.1), which has been omitted from the
body of the paper. In this section we use the notation $(n)_c = n(n-1)\cdots(n-c+1)$.

Thus we have the simultaneous breadth-first tree exploration started at a pair {i,j} of 
states, in which we do not follow the leaf nodes, i.e., those nodes where, in one or both of 
the two trees, there is a state already visited by the exploration. 

This exploration is finite (the number of steps, $L_{ij}$, being bounded by $\sim n$), as we cannot
indefinitely visit new states. A string $\tau^{(ij)} \in \{0,1,2\}^{L_{ij}}$ is associated to this procedure
(we just write $\tau = \tau^{(ij)}$ when no confusion arises), with $\tau_s$ corresponding to the number of states
already visited, among the two involved with the s-th step.

The exclusion of sink and quasi-sink motifs leads to the fact that, among $\{\tau_1, \ldots, \tau_k\}$,
there must be at least one value zero. Say $a_{ij}$ is the first letter, in lexicographic order, with this property.
A further consequence is that we have at least k + 1 non-zero entries $\tau_s$, for $1 \le s \le L_{ij}$.

Recognize that the sequence $(v_1, v_2, \ldots)$ is exactly the ordered sequence of positions s
at which $\tau_s > 0$. Call $|\tau|_s = \sum_{t<s} \tau_t$.
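
The exploration and the string $\tau$ can be sketched in Python as follows. This is one possible reading of the procedure, under assumptions that the text above does not fix: letters are processed in increasing order at each node, the two targets of a step are examined one after the other (so two equal new targets count as one already-visited state), and the function name is hypothetical.

```python
from collections import deque


def explore_pair(delta, i, j):
    """Simultaneous breadth-first exploration started at the pair {i, j}.

    Returns (tau, positions): tau[s-1] is the number of already visited
    states among the two targets of step s, and positions lists the
    steps s (1-based) at which tau_s > 0.
    """
    visited = {i, j}
    queue = deque([(i, j)])
    tau = []
    while queue:
        p, q = queue.popleft()
        for a in range(len(delta[p])):
            t = 0
            for target in (delta[p][a], delta[q][a]):
                if target in visited:
                    t += 1
                else:
                    visited.add(target)
            tau.append(t)
            if t == 0:
                # Internal node: both targets are new, keep exploring.
                queue.append((delta[p][a], delta[q][a]))
            # Otherwise the node is a leaf and is not followed.
    positions = [s for s, t in enumerate(tau, start=1) if t > 0]
    return tau, positions
```

Each internal node adds two new states to the visited set, which is why the exploration stops after a number of steps that is at most linear in n for fixed k.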

We now fix $v \ll n/2$, and $h = O(1)$. The probability for the h-uple $(v_1, \ldots, v_h)$ is

P( Vl ,...,v h )= £ ^ h (n) 2vh ^ Tl jflwX, (A.l) 

{r Vj }e{l,2}h S=l 7 

We can bound from above the probability that $v_h < v$.

VTob{v h <v)= £ P(vi,...,v h )< Y, E ^r( re - 2w )" |rk (ll^) 

(vi,...,v h ) (vi,...,v h ) { Tvj }€{l,2} h S'=l 

v h <v Vh<v 

* E E n(!^Ps E («-2»)-»f[(2» i + 2); 

v h <v v h <v 

(A.3) 

where in the last passage we used the fact that n ^ 2v < 1. Now we use the fact that, for 
$f(v_1, \ldots, v_h)$ a positive function,

E /K--->^)<^ E ( A - 4 ) 



Vl<...<V h V!,...,V h 

Vh<V Vj<V 



to get 



. (n-2v)- h A, n 1 fv(v + l)\ h 

v-i Vh „•— 1 v / 



Vl,...jV h j = \ 

This gives the claim in equation (4.1).
Appendix B. Some statistical properties of tableaux

We investigate here some statistical properties of tableaux and k-Dyck tableaux, concerning
the limit distribution of the marks and its fluctuations. These results are interesting
per se and, for the purposes of this paper, will be instrumental in determining ratios of cardinalities
of various sets of tableaux that, in turn, are used in the proof of Proposition 3.2.

Associate to the backbone part $B_T$ of a tableau $T$ the sequence $c = (c_1, c_2, \ldots, c_n)$, as
$c_y = x_T(y+1) - x_T(y) - 1$ (let conventionally $x_T(0) = 0$ and $x_T(n+1) = kn + 2$). The $c_y$'s
are non-negative integers, related to the incremental steps in the tableau shape $x_T(y)$.

Call $\mu(c)$ the number of tableaux having the given backbone c. This quantity is easily
determined. For $c \in \mathbb{N}^n$, $\mu(c) = 0$ if $\sum_y c_y \neq (k-1)n + 1$, and otherwise

$$\mu(c) = \prod_{y=1}^{n} y^{c_y}. \tag{B.1}$$
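
As a small worked instance of (B.1), the count can be evaluated directly. The Python helper below is only an illustration; its name and argument order are assumptions.

```python
from math import prod


def count_tableaux(c, k):
    """Number of tableaux with backbone sequence c = (c_1, ..., c_n), as in (B.1)."""
    n = len(c)
    if any(cy < 0 for cy in c) or sum(c) != (k - 1) * n + 1:
        return 0
    # Each of the c_y columns attached to row y contributes a factor y.
    return prod(y ** cy for y, cy in enumerate(c, start=1))
```

For k = 2 and n = 2 the constraint is $c_1 + c_2 = 3$, and for example c = (1, 2) gives $1^1 \cdot 2^2 = 4$.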

The property of a tableau $T = (B_T, W_T)$ of being k-Dyck depends only on the backbone
part $B_T$, and we can naturally talk of k-Dyck backbones. The factorization above still holds
for k-Dyck tableaux, if it is intended that $\sum_B$ is restricted to k-Dyck backbones.

The backbone profile has a definite limit shape for large n, that we can readily determine.
Define the function $f_\kappa(y) : [0,1] \to [0,\kappa]$

$$f_\kappa(y) = \lim_{n \to \infty} \frac{1}{n}\, \mathbb{E}\, x_T(ny) \tag{B.2}$$

where the average is taken w.r.t. the uniform measure on $\mathcal{T}[(\lfloor \kappa n \rfloor + 1) \times n]$. Provided
that we have pointwise convergence to a differentiable function (as we will see, this is the
case here), this function is translated into marginals on the sequence c, through $f'_\kappa(y) =
1 + \lim_{n \to \infty} \mathbb{E}\, c_{ny}$.

We deal with the overall constraint by introducing a Lagrange multiplier, associated to
the horizontal width of the tableau (to be tuned later on in order to have concentration on
the appropriate width value $\lfloor \kappa n \rfloor + 1$). The resulting measure is

$$\mu_\omega(c) = \prod_{y=1}^{n} (\omega y)^{c_y}. \tag{B.3}$$

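Now the variables $c_y$ are independent geometric variables, with parameter $p = \omega y$ (i.e.,
$p_y(c) = (1-p)\,p^c$). Average and variance are given by

$$\langle c_y \rangle = \frac{\omega y}{1 - \omega y}; \qquad \langle c_y^2 \rangle - \langle c_y \rangle^2 = \frac{\omega y}{(1 - \omega y)^2}. \tag{B.4}$$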

The value of $\omega$ is determined by the equation $\sum_{y=1}^{n} \frac{\omega y}{1 - \omega y} = (\kappa - 1)n + 1$, that is, in the
large n limit,

$$\int_0^n dy\, \frac{\omega y}{1 - \omega y} = (\kappa - 1)n + O(1). \tag{B.5}$$

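Through the scaling $y \to y/n$, $\omega \to \omega n$ we can extract the leading contribution $\int_0^1 dy\, \frac{\omega y}{1 - \omega y} =
\kappa - 1$, and, as we have

$$\int_0^Y dy\, \frac{\omega y}{1 - \omega y} = -Y - \frac{\ln(1 - \omega Y)}{\omega}, \tag{B.6}$$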
we get that the value $\omega_\kappa$ of the multiplier is the root in the interval (0,1) of the transcendental
equation

$$\kappa = -\frac{\ln(1 - \omega)}{\omega}, \tag{B.7}$$

that is, the same constants defined in (2.3). Then, the limit curve is just deduced from
(B.6) with $\omega = \omega_\kappa$:

$$f_\kappa(y) = -\frac{\ln(1 - \omega_\kappa y)}{\omega_\kappa}. \tag{B.8}$$
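
The multiplier has no closed form, but it is easy to evaluate numerically. The Python sketch below solves the equation $\kappa = -\ln(1-\omega)/\omega$ on (0,1) by bisection and evaluates the limit curve $-\ln(1 - \omega_\kappa y)/\omega_\kappa$. The function names, the restriction to $1 < \kappa \le 30$ (chosen so that the root is well separated from 1 in floating point), and the fixed iteration count are assumptions of this illustration.

```python
import math


def multiplier(kappa, iterations=200):
    """Root in (0, 1) of kappa = -ln(1 - w) / w, found by bisection."""
    if not 1.0 < kappa <= 30.0:
        raise ValueError("this sketch assumes 1 < kappa <= 30")
    lo, hi = 1e-12, 1.0 - 1e-15
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # -ln(1 - w)/w increases from 1 (at w -> 0) to infinity (at w -> 1).
        if -math.log1p(-mid) / mid < kappa:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def limit_shape(y, kappa):
    """Limit curve f_kappa(y) = -ln(1 - w*y) / w for y in [0, 1]."""
    w = multiplier(kappa)
    return -math.log1p(-w * y) / w
```

By construction `limit_shape(1.0, kappa)` returns `kappa` up to rounding, which is the statement that the limit curve spans the full width of the tableau.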

Note that the derivatives of $f_\kappa(y)$ at y = 0 and y = 1 are $f'_\kappa(0) = 1$ and $f'_\kappa(1) = \frac{1}{1 - \omega_\kappa}$,
which are respectively smaller and larger than $\kappa$, for any real $\kappa > 1$. More generally,
$(f'_\kappa(y))^{-1} = 1 - \omega_\kappa y$ is the density of backbone marks around $x = f_\kappa(y)$. As every column
is marked, either in red or in black, the density of black marks around $x = f_\kappa(y)$ is $\omega_\kappa y$. As
the position of a black mark in a given column is chosen uniformly in the range $\{1, \ldots, y\}$,
the probability of putting a black mark in a given position (x, y), provided that $x > f_\kappa(y)$,
is $\omega_\kappa / n$, notably regardless of x and y.

A further useful property of the backbone is the calculation of the variance, in the system
with the Lagrange multiplier (and thus without the constraint $\sum_y c_y = (\kappa - 1)n + 1$), which
is given by the integral

$$S(Y) = \int_0^Y dy\, \frac{\omega y}{(1 - \omega y)^2} = \frac{Y}{1 - \omega Y} + \frac{\ln(1 - \omega Y)}{\omega}. \tag{B.9}$$
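
The closed form of the variance integral can be checked numerically. The Python sketch below compares it with a midpoint-rule quadrature of the integrand $\omega y/(1 - \omega y)^2$; the function names, the number of quadrature steps and the sample values of $\omega$ and Y used in the usage note are assumptions.

```python
import math


def variance_profile(Y, w):
    """Closed form S(Y) = Y / (1 - w*Y) + ln(1 - w*Y) / w, for 0 <= w*Y < 1."""
    return Y / (1.0 - w * Y) + math.log1p(-w * Y) / w


def variance_profile_quadrature(Y, w, steps=100_000):
    """Midpoint-rule approximation of the integral of w*y / (1 - w*y)^2 over [0, Y]."""
    h = Y / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += w * y / (1.0 - w * y) ** 2
    return h * total
```

For instance, with w = 0.8 and Y = 0.5 the two functions agree to well within $10^{-7}$.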
Through the Central Limit Theorem we can deduce from this expression the asymptotic
probability for fluctuations from the limit shape. For a given row y, such that $y, n - y \gg 1$,
the probability of having $x_T(y) = \lfloor n f_\kappa(y) \rfloor + \xi$ is approximately (using here the variance
function (B.9))

$$p^{\rm Gauss}_{y}(\xi) = \frac{1}{\sqrt{2 \pi s}} \exp\left( -\frac{\xi^2}{2s} \right); \qquad s = n\, \frac{S(y/n)\, \big( S(1) - S(y/n) \big)}{S(1)}. \tag{B.10}$$

This is of course only the case r = 1 of the basic formulas for the r-point joint distribution
in an inhomogeneous Wiener Process x(t), derived from the continuum limit of the sum
of independent random variables with variance S(t), as in our case (see e.g. [17, sec. 5.6]).
However, this formula will be sufficient for our present purposes.

We now have all the ingredients to prove Proposition 4.2.

Proof of Proposition 4.2. We start by comparing different functions f satisfying the constraint,
at a fixed value M. Remark that any two such functions $f_1$, $f_2$ differ by a number of
cells bounded by $Bn$, and that, if $f_1(y) \le f_2(y)$ for all y, $|\mathcal{T}[M \times n; f_1]| \le |\mathcal{T}[M \times n; f_2]|$.
Thus, by telescoping, up to a factor $Bn$, it suffices to estimate the quantity

$$\frac{|\mathcal{T}[M \times n; f_2]| - |\mathcal{T}[M \times n; f_1]|}{|\mathcal{T}[M \times n]|}$$

for a pair of functions $f_1$, $f_2$ differing by a single cell in the position (x, y). This quantity
is positive at sight.
Note that the constraint on functions f forces $y, n - y \gg \sqrt{n}$. Thus, the use of (B.10)
(based on use of the Central Limit Theorem) is legitimate, and we have

$$\frac{|\mathcal{T}[M \times n; f_2]| - |\mathcal{T}[M \times n; f_1]|}{|\mathcal{T}[M \times n]|} \sim p^{\rm Gauss}_{y}\big( ky - n f(y) + O(1) \big) \sim \exp\left( -\Theta\left( \frac{\min(y, n-y)^2}{n} \right) \right), \tag{B.11}$$

where the constant is positive at sight, and could be determined from the expressions (B.9)
giving S(y/n) and S(1) − S(y/n) (which are of order 1), and the quantity $ky - n f_k(y)$ (with
$f_k(y)$ as in (B.8)), which is of order min(y, n − y). The precise value is accessible with some
calculation, but irrelevant for our purposes.

Now that we have determined that all functions f in the appropriate range produce the same
ratio, up to an absolute error which is exponentially small, we can evaluate this ratio for
a reference f of our choice. We choose, for any value a such that both a and n − a are of
order n,

Uv) = {Z ( ?\ tj y T~ a (B.i2) 

I / (y) - B y>n-a 
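For a tableau T, call M' the value such that $x_T(n-a) = M' < M - k(a+1)$, the latter
inequality being forced by the constraint $x_T(y) < f(y)$. We can thus express the ratio
$|\mathcal{T}[M \times n; f]| / |\mathcal{T}[M \times n]|$ in the form

$$\frac{|\mathcal{T}[M \times n; f]|}{|\mathcal{T}[M \times n]|} = \frac{1}{|\mathcal{T}[M \times n]|} \sum_{M'} \sum_{c} \mu(c)\, [\![\, |c| = M - n \,]\!] \times [\![ x_T(n-a) = M' ]\!]\; [\![ x_T(y) < f(y) ]\!]_{y > n-a}\; [\![ x_T(y) < f(y) ]\!]_{y \le n-a}. \tag{B.13}$$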

The marginalisation on the value of M' makes the two events $[\![ x_T(y) < f(y) ]\!]_{y > n-a}$ and
$[\![ x_T(y) < f(y) ]\!]_{y \le n-a}$ independent, and in fact the second one depends only on n − a and M',
and the first one only on n, a and M − M' (not on B). Equivalently, as $N = kn + 1 = M - B$,
we can use n, a and M' − B as independent parameters, i.e.,

$$\frac{|\mathcal{T}[M \times n; f]|}{|\mathcal{T}[M \times n]|} = \sum_{M' < M - k(a+1)} p(M')\, p^{+}_{n,a}(M' - B)\, p^{-}(M'), \tag{B.14}$$

where p(M') is nothing but $p^{\rm Gauss}_{n-a}(M' - n f(y))$, and corresponds to the expectation of
$[\![ x_T(n-a) = M' ]\!]$ alone. From equation (B.10) we know that the leading contribution to
p(M') is well-approximated by a Gaussian, with mean and variance

$$\mathbb{E}\left( \frac{M'}{n} \right) = f\left( \frac{n-a}{n} \right); \qquad \mathbb{E}(M'^2) - (\mathbb{E} M')^2 = n\, \frac{S\big(\tfrac{n-a}{n}\big) \Big( S(1) - S\big(\tfrac{n-a}{n}\big) \Big)}{S(1)}, \tag{B.15}$$

where, as (M − 1)/n = k + O(1/n), up to subleading corrections we can use the parameters
k and $\omega$ in the determination of f(y) and S(y). Similarly, as the width of the Gaussian is
of order $\sqrt{N}$, up to subleading corrections we can replace p(M') by p(M' − B), and write,
after a translation,

$$\frac{|\mathcal{T}[M \times n; f]|}{|\mathcal{T}[M \times n]|} = \sum_{M' < N - k(a+1)} p(M')\, p^{+}_{n,a}(M')\, p^{-}(M' + B). \tag{B.16}$$

The expression analogous to (B.16), for M = N, reads

$$\frac{|\mathcal{T}[N \times n; f]|}{|\mathcal{T}[N \times n]|} = \sum_{M' < N - k(a+1)} p(M')\, p^{+}_{n,a}(M')\, p^{-}(M'). \tag{B.17}$$

As the Korshunov constant is of order 1 for all k > 1, the values of the Gaussian p(M') are
of order $1/\sqrt{N}$ at the maximum, and the functions $p^{+}_{n,a}(M')$ and $p^{-}(M')$ are (respectively
decreasing and increasing) monotonic in M', we have that these functions must be of order
1 in the region relevant for p(M'). Actually, in this region we even have $p^{-}(M') = 1 - o(1)$
(see [H Lemma 13]), and in particular, as a corollary, $p^{-}(M')$ is smooth possibly up to small
corrections. Then, the comparison of (B.16) and (B.17) allows us to conclude. ■